Background

In February 2020 the afrimapr team set out to develop building blocks in R that would make open health facility data more accessible to data scientists in Africa and elsewhere. The afrihealthsites package aims to provide functionality to load, analyse, visualise, and map open health facility datasets such as the one compiled by KEMRI-Wellcome for sub-Saharan Africa and the data made available via the Global Healthsites Mapping Project (healthsites.io).

Through our research we came across the term master facility list (MFL). A master facility list contains information about the full complement of health facilities in a country. The World Health Organization has developed a guide for countries that want to develop their own MFL or strengthen an existing one.

We were excited to find several African MFLs available online. We wanted to perform some exploratory analysis on the MFLs from different countries to understand the overlaps and differences in the information made available, the data formats, and more. This work forms part of a research paper that we’re working on.

Obtaining the data

Kenya

The Kenyan MFL is available at http://kmhfl.health.go.ke/#/home. The data is downloadable in Excel format (there seems to be an API as well, but we did not use it). Unfortunately one has to visit the website and manually click on the Export Excel button to obtain the data. Once downloaded to our data/raw_data folder, the data is easily loaded using the read_xlsx function from the readxl package.

| Code | Name | Officialname | Registration_number | Keph level | Facility type | Facility_type_category | Owner | Owner type | Regulatory body | Beds | Cots | County | Constituency | Sub county | Ward | Operation status | Open_whole_day | Open_public_holidays | Open_weekends | Open_late_night | Service_names | Approved | Public visible | Closed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 25720 | Itete Dispensary | Itete Dispensary | 01245 | Level 2 | Dispensary | DISPENSARY | Ministry of Health | Ministry of Health | Ministry of Health | 8 | 3 | Kakamega | Matungu | Matungu | Koyonzo | Operational | No | No | No | No | NA | Yes | Yes | No |
| 25731 | Highrise Healthcare Services | Highrise Healthcare Services | 017325 | Level 2 | Medical Clinic | MEDICAL CLINIC | Private Practice - Nurse / Midwifery | Private Practice | Kenya MPDB | 0 | 0 | Embu | Mbeere South | Mbeere South | Mbeti South | Operational | No | No | No | No | NA | Yes | Yes | No |
Note: An excerpt showing the column headers and format of the raw data available from the Kenyan MFL

Malawi

The Malawi MFL is available at http://zipatala.health.gov.mw/facilities and can be downloaded in Excel or PDF format. An API exists, but no further information about it was available, so we visited the website and downloaded the data to our data/raw_data folder by clicking on the DOWNLOAD EXCEL button.

| CODE | NAME | COMMON NAME | OWNERSHIP | TYPE | STATUS | ZONE | DISTRICT | DATE OPENED | LATITUDE | LONGITUDE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MC010002 | A + A private clinic | A+A | Private | Clinic | Functional | Centrals West Zone | Mchinji | Jan 1st 75 | -13.797421 | 33.885631 |
| BT240003 | A-C Opticals | A.C Opticals | Private | Clinic | Functional | South East Zone | Blantyre | Jan 1st 75 | -15.8 | 35.03 |
Note: An excerpt showing the column headers and format of the raw data available from the Malawian MFL

Namibia

The MFL for Namibia is accessible via an API as described on the website. The data can also be downloaded in Excel format directly from the website, but it should be noted that the resultant Excel file contains a very small subset of the total attributes available.

We had some trouble reading the JSON file in R and decided to develop a script in Python that could access the JSON for each facility and convert the dataset to an object that could further be analysed here in R alongside the other country MFLs.

Details of the data structure and download process are available from the Jupyter Notebook.

| address | alt_phone_number | catchment_population | contact_person | id | infrastructures | location_ownership | location_type | long_name | name | parent_location | phone_number | point_x | point_y | services |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NA | 0 | NA | NA | 11981 | [] | {'name': 'Public_MoHSS'} | {'name': 'Facility'} | NA | Zambezi Regional Health Office | {'id': 10598, 'name': 'Katima Mulilo District'} | NA | -17.4994 | 24.27878 | [] |
| NA | 0 | 4111 | NA | 10131 | [{'id': 1, 'name': 'Ambulances'}, {'id': 2, 'name': 'Beds'}, {'id': 5, 'name': 'Electricity'}, {'id': 6, 'name': 'Running Water'}, {'id': 7, 'name': 'Health Extension Workers'}, {'id': 9, 'name': 'Toilets'}, {'id': 11, 'name': 'Phone Number'}, {'id': 13, 'name': 'Computers'}, {'id': 14, 'name': 'Vehicles'}, {'id': 16, 'name': 'Enrolled Nurses'}, {'id': 17, 'name': 'Registered Nurses'}, {'id': 19, 'name': 'Doctors'}, {'id': 20, 'name': 'Administrative Officers'}] | {'name': 'Public_MoHSS'} | {'name': 'Facility'} | NA | Sibbinda Health Centre | {'id': 10598, 'name': 'Katima Mulilo District'} | NA | -17.7851 | 23.82119 | [{'id': 1, 'name': 'HIV Testing Services'}, {'id': 2, 'name': 'General Clinical Service'}, {'id': 3, 'name': 'Expanded Programme on Immunizations'}, {'id': 8, 'name': 'Preventing Mother To Child Transmission Services'}, {'id': 31, 'name': 'Viral Load Testing'}, {'id': 32, 'name': 'Sexual Transmitted Infections'}, {'id': 36, 'name': 'Anti Retroviral Therapy IMAI Site'}, {'id': 38, 'name': 'Ante Natal Clinic Services'}, {'id': 39, 'name': 'Family Planning Services'}, {'id': 41, 'name': 'Tuberculosis Services'}, {'id': 52, 'name': 'Option B+'}, {'id': 53, 'name': 'DNA EID Testing'}] |
Note: An excerpt showing the column headers and format of the raw data available from the Namibian MFL
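The nested columns in the excerpt (infrastructures, location_ownership, parent_location, services) look like stringified Python lists and dictionaries. As a minimal Python sketch, and not the code from the Jupyter Notebook, they could be unpacked along these lines. The helper names parse_nested and names are hypothetical, and the sketch assumes missing cells are stored as "NA" or an empty string.

```python
import ast

MISSING = ("", "NA")  # assumption: how empty cells are represented


def parse_nested(cell):
    """Parse a cell holding a stringified literal, e.g. "[{'id': 2, 'name': 'Beds'}]"."""
    if cell is None or cell.strip() in MISSING:
        return None
    # Rendered excerpts may show typographic quotes; map them back to plain ones.
    cleaned = cell.replace("\u2018", "'").replace("\u2019", "'")
    # literal_eval only accepts literals, which is safer than eval for downloaded data.
    return ast.literal_eval(cleaned)


def names(cell):
    """Return the 'name' values from a list of dicts, or from a single dict."""
    parsed = parse_nested(cell)
    if parsed is None:
        return []
    if isinstance(parsed, dict):
        parsed = [parsed]
    return [item["name"] for item in parsed]


# Values copied from the excerpt (shortened)
infrastructures = "[{'id': 1, 'name': 'Ambulances'}, {'id': 2, 'name': 'Beds'}]"
ownership = "{'name': 'Public_MoHSS'}"

print(names(infrastructures))
print(names(ownership))
```

Flattening the nested fields into plain lists of names like this makes them easier to compare with the flat columns of the other country MFLs.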

Rwanda

Rwanda makes its MFL available in CSV, Excel or PDF format. Again one has to visit the provided webpage and manually click on the CSV button to download the data.

The raw data contains two District columns (shown as District and District_1 in the excerpt below) that appear to hold identical values.

| Facity Name | id | Opening Date | Sector | Subdistrict | District | Province | District_1 | Facility type | Ownership | LOCATION |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A La Source DISP | 1140 | 1/1/2000 | Muhima | Muhima Sub District | Nyarugenge District | Kigali City | Nyarugenge District | Dispensary | Private | -1.937517 30.059391 |
| Active Life Physiotherapy ltd | 1664 | 8/17/2016 | Kimironko | Kibagabaga Sub District | Gasabo District | Kigali City | Gasabo District | Medical Clinic | Private | NA |
Note: An excerpt showing the column headers and format of the raw data available from the Rwandan MFL
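Whether the two District columns really are duplicates can be checked programmatically, and the combined LOCATION field can be split into separate coordinates. The following is a minimal Python sketch under a few assumptions: the column names match the excerpt above, LOCATION holds "latitude longitude" separated by whitespace, and missing values appear as "NA". The inline sample and the split_location helper are for illustration only.

```python
import csv
import io

# Two rows modelled on the excerpt, reduced to the columns of interest
RAW = """Facity Name,id,District,District_1,LOCATION
A La Source DISP,1140,Nyarugenge District,Nyarugenge District,-1.937517 30.059391
Active Life Physiotherapy ltd,1664,Gasabo District,Gasabo District,NA
"""


def split_location(value):
    """Split a 'lat lon' string into floats; return (None, None) when missing."""
    if value is None or value.strip() in ("", "NA"):
        return (None, None)
    lat, lon = value.split()
    return (float(lat), float(lon))


rows = list(csv.DictReader(io.StringIO(RAW)))

# Facility ids where the two District columns disagree (empty if they are duplicates)
mismatches = [row["id"] for row in rows if row["District"] != row["District_1"]]

coords = [split_location(row["LOCATION"]) for row in rows]

print(mismatches)
print(coords)
```

An empty list of mismatches over the full file would confirm that one of the two columns can safely be dropped.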

South Sudan

The South Sudan facilities list is available from https://www.southsudanhealth.info/facility/fac.php?list. The data can be accessed directly in CSV format at https://www.southsudanhealth.info/PublicData/facility_info_2020-05-08.csv.

| #"idGeo" | Facility | type | Ext Ref | Payam | County | State | Deleted | Indicators | Sampled | Pilot Accessible | Pilot Operational | Extension Accessible | Extension Operational | Alternate Names | Location | ACLED Refs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 140th SPLA Battalion | Other | FC10080402 | Tambura | Tambura County | Gbudwe | NA | 0 | 0 | 0 | 0 | 0 | 0 | NA | 5.52066, 27.46684 | NA |
| 2 | Abara PHCC | Primary Health Care Centre | FC02070702 | Unknown Payam In Magwi | Magwi County | Imatong | NA | 130 | 1 | 0 | 0 | 1 | 1 | abara-phcc, Ababa PHCC | 4.08234, 32.17893 | NA |
Note: An excerpt showing the column headers and format of the raw data available from the South Sudan MFL
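Two features of this file need attention before analysis: the first header carries a leading # and quotes, and the coordinates are stored together in one Location field. The Python sketch below shows one possible clean-up. It assumes the raw header is literally #"idGeo", that Location holds "latitude, longitude", and that missing values are "NA" or empty. The sample text and the helper names clean_header and parse_location are illustrative, not part of the published data.

```python
import csv
import io

# Illustrative sample in the shape of the excerpt (not the full file)
RAW = '''#"idGeo",Facility,type,Location
1,140th SPLA Battalion,Other,"5.52066, 27.46684"
2,Abara PHCC,Primary Health Care Centre,"4.08234, 32.17893"
'''


def clean_header(name):
    """Remove surrounding whitespace, '#' and quote characters from a column name."""
    return name.strip().strip('#"')


def parse_location(value):
    """Split 'lat, lon' into floats; return (None, None) when missing."""
    if value is None or value.strip() in ("", "NA"):
        return (None, None)
    lat, lon = (float(part) for part in value.split(","))
    return (lat, lon)


reader = csv.reader(io.StringIO(RAW))
header = [clean_header(h) for h in next(reader)]
records = [dict(zip(header, row)) for row in reader]

for record in records:
    record["latitude"], record["longitude"] = parse_location(record["Location"])

print(header)
print(records[0]["latitude"], records[0]["longitude"])
```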

Tanzania

For Tanzania the MFL is available at http://hfrportal.moh.go.tz/index.php?r=page/index&page_name=about_page with data downloadable in Excel format (XLS). The geocoded data can be accessed directly via a URL, with no need to interact with the website manually. It should be noted that the data can be cached as an empty dataset. If the downloaded file contains no data, please visit the website and ensure all geocoded facilities are selected.

There are also 1,378 facilities without coordinates in this database. These can be downloaded by visiting this URL.

Thankfully these two datasets contain the same columns (Latitude and Longitude are retained in the non-geocoded file). We can therefore merge the two datasets easily for combined analysis.
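Because the columns match, combining the two files amounts to stacking their rows. A minimal Python sketch of that idea follows. It assumes both files have already been converted to CSV text with identical headers. The second sample row is a hypothetical placeholder rather than a real facility, and the has_coordinates flag is an illustrative addition, not a column in the source data.

```python
import csv
import io

# One row from the excerpt, reduced to a few columns
GEOCODED = """Facility Number,Facility Name,Latitude,Longitude
113310-7,2001 GEM PLUS,-2.563525,32.91266
"""

# Hypothetical placeholder row for a facility without coordinates
NOT_GEOCODED = """Facility Number,Facility Name,Latitude,Longitude
000000-0,Hypothetical Dispensary,,
"""


def read_rows(text):
    """Return (column names, list of row dicts) for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return reader.fieldnames, list(reader)


def combine(geocoded_text, other_text):
    """Stack two datasets after checking that their columns are identical."""
    cols_a, rows_a = read_rows(geocoded_text)
    cols_b, rows_b = read_rows(other_text)
    if cols_a != cols_b:
        raise ValueError("The two files do not share the same columns")
    combined = rows_a + rows_b
    for row in combined:
        # Flag rows that actually carry both coordinates
        row["has_coordinates"] = bool(row["Latitude"].strip() and row["Longitude"].strip())
    return combined


combined = combine(GEOCODED, NOT_GEOCODED)
print(len(combined), [row["has_coordinates"] for row in combined])
```

Checking the headers before stacking guards against silently misaligned columns if either export changes in future.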

Unfortunately, the very first row of the Excel sheet is made up of merged cells. The row contains the date and time at which the data was downloaded and can be deleted. Because of the merged cells, the data has to be downloaded to disk and opened in Excel, LibreOffice or another spreadsheet package, where the first row is removed and the file saved. Only after this can the file be loaded successfully in R.

| Facility Number | Facility Name | Common Name | Registration Status | Created At | Updated At | Zone | Region | District | Council | Ward | Village/Street | Facility Type | Operating Status | Ownership | Registration Number | CTC Number | Latitude | Longitude | Date Opened | National Grid | Generator | Solar Panels | No Electricity | Other |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 113310-7 | 2001 GEM PLUS | NA | Registered | 2019-03-27T18:14:11.000Z | 2019-05-28T17:32:43.000Z | Lake Zone | Mwanza | Nyamagana | Nyamagana MC | Butimba | Not set | Health Labs - Level IA2 (Dispensary Laboratory) | Operating | Private - For Profit | PHL-C/MWZ/AUT/06 | NA | -2.563525 | 32.91266 | NA | 0 | 0 | 0 | 0 | 0 |
| 100017-3 | 202 KJ | NA | NA | 2013-08-20T10:23:40.000Z | 2018-08-20T11:36:15.000Z | Central Zone | Not set | Not set | Not set | Not set | Not set | Dispensary | Closed | Public - Military | NA | NA | -5.057160 | 32.82869 | NA | 0 | 0 | 0 | 0 | 0 |
Note: An excerpt showing the column headers and format of the raw data available from the Tanzanian MFL

Zambia

The Zambian MFL is hosted on GitHub. The raw data is available in CSV format in the GitHub repository.

| province | district | name | HMIS_code | DHIS2_UID | smartcare_GUID | eLMIS_ID | iHRIS_ID | location | ownership | facility_type | longitude | latitude | catchment_population_head_count | catchment_population_cso | operation_status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Central | Chibombo | Chamakubi Health Post | 10010001 | pXhz0PLiYZX | 7b46450b78a04a1db64c0fc9bb014773 | NA | facility\|1 | Rural | GRZ | Health Post | 27.64199 | -14.79990 | 6624 | 6624 | Operational |
| Central | Chibombo | Kabangalala Rural Health Centre | 10010011 | sbFApO4who4 | 9a450380b7db4f2fb13156d03fc0bc6d | NA | facility\|10 | Rural | GRZ | Rural Health Centre | 27.72866 | -15.16894 | 6345 | 6900 | Operational |
Note: An excerpt showing the column headers and format of the raw data available from the Zambian MFL

Loading other open health facility data sets

We can also access the open health facility data available through the KEMRI-Wellcome group and healthsites.io. Both these datasets can be accessed via the afrimapr afrihealthsites package.

Below are excerpts from the WHO and healthsites.io data for Kenya to give the reader an overview of column headers and data format.

Kenya: WHO dataset

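| Country | Admin1 | Facility name | Facility type | Ownership | Lat | Long | LL source | iso3c |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kenya | Baringo | Aiyebo Dispensary | Dispensary | MoH | 0.65783 | 35.80768 | GPS | KEN |
| Kenya | Baringo | Akwichatis Health Centre | Health Centre | MoH | 1.00150 | 36.23620 | GPS | KEN |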
Note: An excerpt showing the column headers and format of the raw data available from the WHO data for Kenya

Kenya: healthsites.io dataset

osm_id osm_type completeness is_in_health_zone amenity speciality addr_full operator water_source changeset_id insurance staff_doctors contact_number uuid electricity opening_hours operational_status source is_in_health_area health_amenity_type changeset_version emergency changeset_timestamp name staff_nurses changeset_user wheelchair beds url dispensing healthcare operator_type geometry country iso3c
696655697 node 27 pharmacy 62793048 37fa2725b7824f60ad7ba9f4103ccb06 08:00-20:00 operational survey 7 1537524419 Nafuu Chemist cbeddow yes private c(36.778198187095, -1.31241153440629) Kenya KEN
6807606134 node 10 clinic 74663883 87c9e47944eb4ac5ac72daf6d7ac2f86 1 1568884347 Arap Kobilo yes c(35.9799016217169, 0.468861661533481) Kenya KEN
Note: An excerpt showing the column headers and format of the raw data available from the healthsites.io data for Kenya

Exploring the data

Facility Attributes

We’ll first take a look at the attributes that are available from the country MFLs and the other open data sources. The table below shows great variability in how well facilities are described in the various datasets.

| healthsites.io | WHO | Kenya MFL | Malawi MFL | Namibia MFL | Rwanda MFL | South Sudan MFL | Tanzania MFL | Zambia MFL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| addr_full | Admin1 | Approved | CODE | address | District | #"idGeo" | Common Name | catchment_population_cso |
| amenity | Country | Beds | COMMON NAME | alt_phone_number | District_1 | ACLED Refs | Council | catchment_population_head_count |
| beds | Facility name | Closed | DATE OPENED | catchment_population | Facility type | Alternate Names | Created At | DHIS2_UID |
| changeset_id | Facility type | Code | DISTRICT | contact_person | Facity Name | County | CTC Number | district |
| changeset_timestamp | iso3c | Constituency | LATITUDE | id | id | Deleted | Date Opened | eLMIS_ID |
| changeset_user | Lat | Cots | LONGITUDE | infrastructures | LOCATION | Ext Ref | District | facility_type |
| changeset_version | LL source | County | NAME | location_ownership | Opening Date | Extension Accessible | Facility Name | HMIS_code |
| completeness | Long | Facility type | OWNERSHIP | location_type | Ownership | Extension Operational | Facility Number | iHRIS_ID |
| contact_number | Ownership | Facility_type_category | STATUS | long_name | Province | Facility | Facility Type | latitude |
| country | | Keph level | TYPE | name | Sector | Indicators | Generator | location |
| dispensing | | Name | ZONE | parent_location | Subdistrict | Location | Latitude | longitude |
| electricity | | Officialname | | phone_number | | Payam | Longitude | name |
| emergency | | Open_late_night | | point_x | | Pilot Accessible | National Grid | operation_status |
| geometry | | Open_public_holidays | | point_y | | Pilot Operational | No Electricity | ownership |
| health_amenity_type | | Open_weekends | | services | | Sampled | Operating Status | province |
| healthcare | | Open_whole_day | | | | State | Other | smartcare_GUID |
| insurance | | Operation status | | | | type | Ownership | |
| is_in_health_area | | Owner | | | | | Region | |
| is_in_health_zone | | Owner type | | | | | Registration Number | |
| iso3c | | Public visible | | | | | Registration Status | |
| name | | Registration_number | | | | | Solar Panels | |
| opening_hours | | Regulatory body | | | | | Updated At | |
| operational_status | | Service_names | | | | | Village/Street | |
| operator | | Sub county | | | | | Ward | |
| operator_type | | Ward | | | | | Zone | |
| osm_id | | | | | | | | |
| osm_type | | | | | | | | |
| source | | | | | | | | |
| speciality | | | | | | | | |
| staff_doctors | | | | | | | | |
| staff_nurses | | | | | | | | |
| url | | | | | | | | |
| uuid | | | | | | | | |
| water_source | | | | | | | | |
| wheelchair | | | | | | | | |
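One simple way to quantify the overlap between these attribute lists is to normalise the column names (lower case, underscores in place of spaces and punctuation) and count how many datasets share each one. The Python sketch below does this for four of the MFLs, using the column names listed above. The normalisation rule and the helper names normalise and shared_attributes are assumptions made for illustration. Exact name matching will miss attributes that are named differently but mean the same thing, such as Owner and OWNERSHIP.

```python
import re
from collections import Counter

# Column names as listed in the table above
DATASETS = {
    "Kenya": ["Code", "Name", "Officialname", "Registration_number", "Keph level",
              "Facility type", "Facility_type_category", "Owner", "Owner type",
              "Regulatory body", "Beds", "Cots", "County", "Constituency",
              "Sub county", "Ward", "Operation status", "Open_whole_day",
              "Open_public_holidays", "Open_weekends", "Open_late_night",
              "Service_names", "Approved", "Public visible", "Closed"],
    "Malawi": ["CODE", "NAME", "COMMON NAME", "OWNERSHIP", "TYPE", "STATUS",
               "ZONE", "DISTRICT", "DATE OPENED", "LATITUDE", "LONGITUDE"],
    "Tanzania": ["Facility Number", "Facility Name", "Common Name",
                 "Registration Status", "Created At", "Updated At", "Zone",
                 "Region", "District", "Council", "Ward", "Village/Street",
                 "Facility Type", "Operating Status", "Ownership",
                 "Registration Number", "CTC Number", "Latitude", "Longitude",
                 "Date Opened", "National Grid", "Generator", "Solar Panels",
                 "No Electricity", "Other"],
    "Zambia": ["province", "district", "name", "HMIS_code", "DHIS2_UID",
               "smartcare_GUID", "eLMIS_ID", "iHRIS_ID", "location", "ownership",
               "facility_type", "longitude", "latitude",
               "catchment_population_head_count", "catchment_population_cso",
               "operation_status"],
}


def normalise(name):
    """Lower-case a column name and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def shared_attributes(datasets, min_count):
    """Return normalised attribute names present in at least min_count datasets."""
    counts = Counter()
    for columns in datasets.values():
        counts.update({normalise(column) for column in columns})
    return sorted(name for name, n in counts.items() if n >= min_count)


print(shared_attributes(DATASETS, 3))
print(shared_attributes(DATASETS, 4))
```

Under this exact-match rule no attribute name is common to all four lists, which illustrates how much manual harmonisation a cross-country comparison requires.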